1. MAJOR ACCOMPLISHMENTS

The  major  accomplishments  of  the  project  include: 

•  studying  the  fundamental  ingredients  and  characteristics  of  a  service  migration  and  the  key 
system  properties  that  support  an  assured  service  migration; 

•  developing  a  formal  logic  for  service  migration  modeling,  survivability  policy  specification,  and 
system  property  verification; 

•  developing  a  viable  method  for  linking  survivability  constraint  solving  to  logic  reasoning; 

•  modeling  service  migration  and  studying  critical  factors  that  affect  an  effective  service 
migration; 

•  specifying  service  migration  semantics  and  constraint  solving; 

•  developing  a  belief-based  decision  approach  for  determining  service  migration  in  case  of  a 
security  incident; 

• constructing a fuzzy inference model to identify a service migration strategy;

•  developing  a  logic  approach  for  service  migration  scheduling; 

•  developing  a  mobile  agent-based  scheme  for  service  migration  simulation  and  verification; 

•  proposing  a  constrained,  possibilistic  logic  approach  for  system  survivability  evaluation. 

2.  EXECUTIVE  SUMMARY 

Information systems are used extensively in many high-security and high-integrity settings to
support our society’s critical services, including national defense and homeland security. Any
disruption  of  those  systems,  even  for  a  short  period  of  time,  could  result  in  severe  consequences.  In 
order  to  respond  to  security  incidents  and  survive  devastating  attacks,  a  critical  system  must  be  able 
to  adapt  to  its  operating  environments  dynamically.  The  approach  that  we  study  in  this  research  is  to 
equip  the  system  with  an  ability  to  migrate  the  critical  services  from  the  compromised  platforms  to 
other  clean,  healthy  platforms.  Service  migration  is  an  important  strategy  for  system  survivability.  In 
a  situation  where  component  replication  is  difficult  or  damage  masking  fails,  service  migration  is  a 
viable  solution  to  ensure  that  the  critical  services  can  be  continuously  provided  even  in  case  of 
malicious  attacks.  Conceptually,  service  migration  involves  suspending  the  current  service  state, 
moving  the  core  service  programs  and  other  trustworthy  space  to  other  platforms,  and  resuming 
where computation was left off on the new platforms. Service migration helps a system avoid further
loss in case of a security incident and ensures service continuity even when part of the system has
been damaged.

In this research we aim to develop a logical framework of service migration for system
survivability. More specifically, we have (1) developed a formal logic to represent and reason about
system properties to support an assured service migration, (2) specified a semantic model for service
migration, (3) modeled the service migration process and studied the critical factors that affect an
efficient service migration, and (4) developed a holistic approach for service migration decisions
with three decision components: (a) determining whether a service migration is the most
appropriate course of action to take in case of a malicious attack; (b) specifying the best strategy for a
service migration; and (c) developing an effective and efficient schedule for the service migration
activities. In addition, we have developed an agent-based system for service migration simulation and
a constrained, possibilistic logic for system survivability evaluation.

2.1 A Logical Framework for Service Migration

In  developing  a  formal  logical  framework  to  represent  and  reason  about  system  properties  to  support 
an  assured  service  migration,  we  first  specify  an  abstract  system  model  which  lays  out  the 
architectural  foundation  to  model  various  components  of  a  service  migration  and  relocation. 
Stripping a system and its properties down to their essentials and applying formal analysis allows
us to study the survivability strength and criticality of the system components and their
functional/security properties for an assured service migration. In the system architecture, a
migration-enabled system (denoted as SYS) has the following components and processes:

• PM = {p1, p2, ..., pm}: a set of distributed computing platforms, where each pi can support a set of
services as represented by its capability set Abi(pi);

• SV = {s1, s2, ..., sn}: a set of services supported by the platforms of SYS. The notation Ex(sj, pi) is
used to represent that a service sj is executed on platform pi during a particular time period;

• Migration Manager (MM): a component of SYS which coordinates the service migration activities.
As part of MM, a scheduler executes a function Choose(pi, sj, pk) to generate a service migration
arrangement for each service sj ∈ SV to be migrated from its current platform pi to a new healthy
platform pk ∈ PM. Service migration is necessary when severe damage to the platform pi is
detected, or when the current platform cannot satisfy the security requirements of the services given the
changed operating environment;

• M(sj, pi, pk) = Ex(sj, pi) → Ex(sj, pk): a service migration process that suspends the current service sj
on pi, moves its core programs (and other trustworthy space) to a new platform pk, and resumes where
service was left off on the new platform;

• R(sj, pk, pi) = Ex(sj, pk) → Ex(sj, pi): a service relocation process in which the service sj is transferred
from the migrated platform pk to its original platform pi after pi has been recovered from the damage
and the operating environment has improved.
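The components listed above can be illustrated with a small data model. The following Python sketch is only an illustration under stated assumptions: the class and method names (Platform, MigrationSystem, choose, migrate, relocate) are hypothetical, Ex is stored as a mapping from a service to the platform it currently executes on, and the scheduler simply returns the first healthy, capable platform. It is not the scheduling method developed in the project.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    name: str
    abilities: set          # capability set Abi(p): services this platform can support
    healthy: bool = True    # whether the platform is currently damage-free


@dataclass
class MigrationSystem:
    platforms: dict                               # PM, keyed by platform name
    running: dict = field(default_factory=dict)   # Ex(s, p) stored as service -> platform name

    def choose(self, current, service):
        """Sketch of Choose(pi, sj, pk): first healthy platform other than pi that supports sj."""
        for name, platform in self.platforms.items():
            if name != current and platform.healthy and service in platform.abilities:
                return name
        return None

    def migrate(self, service, src, dst):
        """Sketch of M(sj, pi, pk): Ex(sj, pi) becomes Ex(sj, pk)."""
        if self.running.get(service) != src:
            raise ValueError(f"{service} is not executing on {src}")
        self.running[service] = dst

    def relocate(self, service, src, dst):
        """Sketch of R(sj, pk, pi): allowed only after the original platform has recovered."""
        if not self.platforms[dst].healthy:
            raise ValueError(f"{dst} has not been recovered")
        self.migrate(service, src, dst)


system = MigrationSystem(
    platforms={"p1": Platform("p1", {"s1"}), "p2": Platform("p2", {"s1", "s2"})},
    running={"s1": "p1"},
)
system.platforms["p1"].healthy = False        # damage detected on p1
target = system.choose("p1", "s1")
system.migrate("s1", "p1", target)
after_migration = system.running["s1"]
system.platforms["p1"].healthy = True         # p1 repaired
system.relocate("s1", "p2", "p1")
after_relocation = system.running["s1"]
```

In this toy run the service moves from p1 to p2 once p1 is marked damaged and returns to p1 after the repair, mirroring the migration and relocation processes described above.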

We  have  specified  the  major  activities  and  the  timeline  of  a  service  migration/relocation  process 
as shown in Figure 1. The entire process is triggered by an event such as the detection of severe
damage to platform pi. The first step is for the scheduler to generate a feasible arrangement for each
critical service sj currently executed on pi to be migrated to a healthy platform pk. In the meantime, sj
is halted, which involves freezing service processes, recording global data (service configuration and state),
recording the states of individual processes, and terminating the entire service program. The
migration process M(sj, pi, pk) starts as soon as the alternative service platform is determined
and sj is appropriately halted. M(sj, pi, pk) is composed of three sub-actions: (1) migration preparation;
(2) service/data transfer; and (3) service setup on pk. The service sj may be executed on the new
platform pk until its completion. For a long-running service, however, if the previously damaged
platform pi has been recovered, a relocation process R(sj, pk, pi) may transfer sj back to pi, where it can
be completed. R(sj, pk, pi) is composed of three sub-actions similar to those of M(sj, pi, pk).


To  ensure  those  migration  activities  are  carried  out  correctly,  we  must  specify  the  requirements 
for  the  key  system  and  service  properties,  which,  if  satisfied,  will  provide  an  assured  service 
migration.  A  formal  logic  has  been  developed  for  system  activity  specification  in  which  important 
service  characteristics  are  preserved  during  and  after  a  migration.  Our  logical  framework  provides 
means  to  represent  and  verify  that  a  system  with  the  required  properties  satisfies  a  user’s  policy  in 
terms  of  the  desired  survivability  objectives.  We  specify  a  set  of  requirements  on  system/service 
properties  as  domain  specific  constraints.  Logic  reasoning  and  constraint  solving  are  separately 
designed  but  integrated  through  the  applications  of  the  logic  inference  rules. 


2.1.1  The  Logic 

In  our  logic,  time  is  represented  by  points  on  the  real  line  and  durations  are  intervals  on  the  real  line. 
Causality  among  sequences  of  activities  is  captured  through  implications  between  formulas.  The  basic 
types of the logic language include entities (e.g., a platform pi and a service sj), actions (Acts), events
(Evts), time points, time intervals, and various system properties. As a schedulable unit of work, an action
represents an activity or a set of activities in a service migration/relocation process. An event is
defined as a temporal marker which occurs at a certain time point. For each action Act, two time
points are implicitly defined, denoted as Act↑ and Act↓, which represent the starting and ending time
points of Act, respectively. We use @Evt to represent the time point when an event Evt occurs and
Du(Act) to represent the duration of the action Act.
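As a small illustration of these temporal notions, the sketch below represents an action by its starting and ending time points on the real line and derives its duration. The names Action, du and starts_after are hypothetical and the numeric values are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    start: float   # starting time point of the action
    end: float     # ending time point of the action

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("an action cannot end before it starts")


def du(act):
    """Du(Act): the duration of an action, i.e. the length of its interval."""
    return act.end - act.start


def starts_after(first, second):
    """True when `second` starts no earlier than `first` finishes."""
    return second.start >= first.end


halting = Action("H(sj)", 0.0, 2.5)
migration = Action("M(sj, pi, pk)", 2.5, 7.0)
migration_duration = du(migration)
ordered = starts_after(halting, migration)
```

Here the migration lasts 4.5 time units and starts exactly when halting finishes, which is the kind of ordering the temporal connectives of the logic are meant to express.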

The  formulas  and  connectives  to  form  the  logic  are  presented  next,  where  P  represents  an  atomic 
formula  over  an  action  or  an  event;  A  and  B  represent  atomic  or  compound  formulas;  v  is  a  variable 
with  a  sort  (type)  t;  C  represents  a  constraint;  and  Pro  represents  system/service  properties  that  must 
hold  during  and  after  a  service  migration.  If  a  formula  represents  an  action/event,  the  formula  is  true  if 
the  represented  action/event  is  (or  can  be)  successfully  finished. 

(Formulas)                A, B      ::= P | A ∧ B | A ∨ B | A → B | A →□ B | A →o B | A # B | A ‡ B
                                      | A «c C | C »c A | ∀v:t. A | ∃v:t. A
(Atomic formulas)         P         ::= Act | Evt
(Actions)                 Act       ::= Execution | Scheduling | Halting
(Events)                  Evt       ::= Damage_Detection | Choose | ...
(Constraints)             C         ::= Exp | Exp@Evt | Exp<T1, T2>
(Constraint expressions)  Exp       ::= CT | CT1 Op CT2
(Constraint terms)        CT        ::= Pro | T | Func | Exp
(Constraint functions)    Func      ::= Du(Act) | Remaining(sj) | ...
(Time points)             T         ::= c | @Evt | Act↑ | Act↓
(Operators)               Op        ::= < | ≤ | > | = | ...
(Properties)              Pro       ::= Healthy(pi) | Service_level(sj, pi) | ...
(Entities)                Platforms ::= pi | pk | ...

(Entities)                Services  ::=

The meanings of the logic connectives are presented below and their semantics are formally
described by the inference rules.

A →□ B: formula A implies formula B to be true in the future, i.e., the action/event represented by formula
B will start sometime in the future relative to the time when the action/event represented by
formula A finishes;

A →o B: formula A implies formula B to be true in the next state, i.e., the action/event represented by A
causes B to occur in the next state relative to the time when the action/event represented by A
finishes;

A ‡ B: the actions/events represented by A and B are executed concurrently and start at the same time;

A # B: the actions/events represented by A and B are two sequential sub-components of a compound
action/event;

C »c A: constraint implication, which introduces a pre-constraint C to formula A;

A «c C: constraint conjunction, which asserts the validity of a post-constraint C of formula A.

Our  logic  is  designed  to  represent  a  service  migration/relocation  process  and  to  specify  the 
necessary  system  properties  to  support  an  assured  service  migration.  We  describe  the  actions/events 
and  their  inherent  relationships  as  a  set  of  system  transition  rules  (TRs).  Each  transition  rule 
describes  the  required  system/service  behaviors  in  a  service  migration/relocation  process.  If  a  system 
is  developed  with  those  transition  rules  and  the  necessary  properties,  users  can  be  certain  that  the 
system satisfies the survivability policy. In our research, we have identified ten transition rules;
only the first two are shown in this report.

TR1: DD(sj, pi) →o ((P(sj) > L*)@DD(sj, pi) »c H(sj) ‡ S(pi, sj))

Transition rule TR1 represents the temporal relationships among DD(sj, pi), H(sj) and S(pi, sj) as well as a
pre-constraint for service halting and migration scheduling: immediately after the execution of a critical
service sj on platform pi is signaled to stop (i.e., DD(sj, pi)), a scheduling program (i.e., S(pi, sj)) is executed
in order to identify a healthy platform for sj to migrate to. In the meantime, sj is appropriately halted (i.e.,
H(sj)). Since those two actions are performed concurrently, they are represented by the compound formula
H(sj) ‡ S(pi, sj). The pre-constraint of H(sj) ‡ S(pi, sj) is that the priority of sj must be greater than L* (i.e.,
P(sj) > L*).

TR2: S(pi, sj) →o ∃pk. (C(pi, sj, pk) →o Sh(pi, sj, pk) «c ((sj ∈ Abi(pk) ∧ SL(sj, pk) ≥ L)@C(pi, sj, pk) ∧
He(pk)@C(pi, sj, pk)))

Transition rule TR2 indicates that if the Choose function (i.e., C(pi, sj, pk)) identifies a new platform pk for
service sj to migrate to as a result of the scheduling process (i.e., S(pi, sj)), sj is scheduled to migrate from pi
to pk (i.e., Sh(pi, sj, pk)). The post-constraints of Sh(pi, sj, pk) are: (1) pk can fulfill the necessary functions of
sj (i.e., sj ∈ Abi(pk)); (2) the new platform pk is healthy at the time of the scheduling decision, i.e.,
He(pk)@C(pi, sj, pk); and (3) the service level of sj on the new platform pk will be maintained at least at a
level of L, i.e., (SL(sj, pk) ≥ L)@C(pi, sj, pk).
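The three post-constraints of the second transition rule can be read as a filter over candidate platforms. The sketch below is a simplified illustration: the function name admissible_targets, the dictionary layout of the platform records, and the sample values are all assumptions made for the example, and the constraints are evaluated on a single snapshot taken at the time of the scheduling decision.

```python
def admissible_targets(service, current, platforms, min_level):
    """Return the platforms that satisfy the three post-constraints for a migration target.

    `platforms` maps a platform name to a dict with hypothetical keys:
    "abilities" (set of services), "healthy" (bool), "levels" (service -> service level).
    """
    targets = []
    for name, info in platforms.items():
        if name == current:
            continue
        can_run = service in info["abilities"]                   # sj belongs to Abi(pk)
        is_healthy = info["healthy"]                             # He(pk)
        level_ok = info["levels"].get(service, 0) >= min_level   # SL(sj, pk) is at least L
        if can_run and is_healthy and level_ok:
            targets.append(name)
    return targets


platforms = {
    "p1": {"abilities": {"s1"}, "healthy": False, "levels": {"s1": 5}},
    "p2": {"abilities": {"s1"}, "healthy": True, "levels": {"s1": 2}},
    "p3": {"abilities": {"s1"}, "healthy": True, "levels": {"s1": 4}},
}
targets = admissible_targets("s1", "p1", platforms, min_level=3)
```

With a required level of 3, only p3 qualifies: p2 is healthy and capable but offers too low a service level.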

The inference rules in our logic are represented using sequent calculus. We start from a sequent
(hypothetical judgment) with the format Σ; Ψ; Γ ⇒ φ\𝒢, where Σ represents a context specifying
the sort (or type) of each term or variable appearing in a formula, Ψ represents a set of assumptions
in terms of constraints (called assumption constraints), Γ represents a set of hypothetical logic
formulas, φ represents a conclusion formula, and 𝒢 represents a set of constraints to be satisfied or
solved (called goal constraints). The sequent is interpreted as: given all the variables defined in Σ, if
the assumption constraints in Ψ and the logic hypotheses in Γ are assumed to be true, then we can
prove the logic goal φ subject to the satisfaction of all the goal constraints in 𝒢.

The constraint formulas in Ψ represent the required system/service properties such as system
support for a service migration (e.g., the pre- and post-constraints of the activities in a
migration/relocation process), the service quality level on a platform, and the temporal restrictions on
system/service actions and events (e.g., the time bound to complete an activity). 𝒢 is the conjunction
of all the constraints that must be satisfied given the constraint assumptions in Ψ. 𝒢 is specified for the
entire proof process and hence it is checked in the last step.

Γ includes two categories of hypotheses: (1) the system transition rules which model sequences of
service  migration/relocation  actions;  and  (2)  a  set  of  “known  facts”  such  as  a  damage  detection  on  a 
platform  at  a  certain  time  point  (modeled  as  time  zero),  an  arrangement  for  a  service  to  migrate  to  a 
new  platform,  and  a  notification  of  a  compromised  platform  being  repaired.  Those  formulas  are  used 
as  assumptions  for  logic  reasoning. 


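check(Ψ1 ∪ Ψ2)   Σ; Ψ1; Γ ⇒ A\𝒢1   Σ; Ψ2; Γ, B ⇒ φ\𝒢2
--------------------------------------------------------- (→L rule)
Σ; Ψ1 ∪ Ψ2; Γ, A → B ⇒ φ\(𝒢1 ∪ 𝒢2)

Σ; Ψ; Γ, A, B ⇒ φ\𝒢
--------------------------------------------------------- (∧L rule)
Σ; Ψ; Γ, A ∧ B ⇒ φ\𝒢

check(Ψ, (A ~o B))   Σ; Ψ; Γ, A → B ⇒ φ\𝒢
--------------------------------------------------------- (→o L rule)
Σ; admit(Ψ, (A ~o B)); Γ, A →o B ⇒ φ\𝒢

check(Ψ, (A ~□ B))   Σ; Ψ; Γ, A → B ⇒ φ\𝒢
--------------------------------------------------------- (→□ L rule)
Σ; admit(Ψ, (A ~□ B)); Γ, A →□ B ⇒ φ\𝒢

...
--------------------------------------------------------- (»c L rule)
Σ; Ψ; Γ, C »c A ⇒ φ\admit(𝒢, C)

...
--------------------------------------------------------- («c L rule)
Σ; admit(Ψ, C); Γ, A «c C ⇒ φ\𝒢

check(Ψ, (A # B))   Σ; Ψ; Γ, A →o B ⇒ φ\𝒢
--------------------------------------------------------- (# L rule)
Σ; admit(Ψ, (A # B)); Γ, A # B ⇒ φ\𝒢

check(Ψ, (A ‡ B))   Σ; Ψ; Γ, A ∧ B ⇒ φ\𝒢
--------------------------------------------------------- (‡ L rule)
Σ; admit(Ψ, (A ‡ B)); Γ, A ‡ B ⇒ φ\𝒢

Σ ⊢ v:t   Σ; Ψ; Γ, [v/x]A ⇒ φ\𝒢
--------------------------------------------------------- (∀L rule)
Σ; Ψ; Γ, ∀x:t. A ⇒ φ\𝒢

Σ ⊢ v:t   Σ; Ψ; Γ, [v/x]A ⇒ φ\𝒢
--------------------------------------------------------- (∃L rule)
Σ; Ψ; Γ, ∃x:t. A ⇒ φ\𝒢

check(Ψ, (A↑ = t1, A↓ = t2))   Σ; Ψ; Γ, A ⇒ φ\𝒢
--------------------------------------------------------- (<> L rule)
Σ; admit(Ψ, (A↑ = t1, A↓ = t2)); Γ, A<t1, t2> ⇒ φ\𝒢

Σ; Ψ; Γ ⇒ A\𝒢   Ψ ⊢ 𝒢
--------------------------------------------------------- (Constraint Solving rule)
Σ; Ψ; Γ ⇒ A

Σ; Ψ; Γ, A ⇒ B\𝒢
--------------------------------------------------------- (→R rule)
Σ; Ψ; Γ ⇒ (A → B)\𝒢

check(Ψ1 ∪ Ψ2)   Σ; Ψ1; Γ ⇒ A\𝒢1   Σ; Ψ2; Γ ⇒ B\𝒢2
--------------------------------------------------------- (∧R rule)
Σ; Ψ1 ∪ Ψ2; Γ ⇒ (A ∧ B)\(𝒢1 ∪ 𝒢2)

Σ; Ψ; Γ ⇒ (A → B)\𝒢
--------------------------------------------------------- (→o R rule)
Σ; Ψ; Γ ⇒ (A →o B)\admit(𝒢, (A ~o B))

Σ; Ψ; Γ ⇒ (A → B)\𝒢
--------------------------------------------------------- (→□ R rule)
Σ; Ψ; Γ ⇒ (A →□ B)\admit(𝒢, (A ~□ B))

...
--------------------------------------------------------- (»c R rule)
Σ; admit(Ψ, C); Γ ⇒ (C »c A)\𝒢

Σ ⊢ var(C)   Σ; Ψ; Γ ⇒ A\𝒢
--------------------------------------------------------- («c R rule)
Σ; Ψ; Γ ⇒ (A «c C)\admit(𝒢, C)

Σ; Ψ; Γ ⇒ (A →o B)\𝒢
--------------------------------------------------------- (# R rule)
Σ; Ψ; Γ ⇒ (A # B)\admit(𝒢, (A # B))

Σ; Ψ; Γ ⇒ (A ∧ B)\𝒢
--------------------------------------------------------- (‡ R rule)
Σ; Ψ; Γ ⇒ (A ‡ B)\admit(𝒢, (A ‡ B))

Σ ⊢ v:t   Σ; Ψ; Γ ⇒ ([v/x]A)\𝒢
--------------------------------------------------------- (∀R rule)
Σ; Ψ; Γ ⇒ (∀x:t. A)\𝒢

Σ ⊢ v:t   Σ; Ψ; Γ ⇒ ([v/x]A)\𝒢
--------------------------------------------------------- (∃R rule)
Σ; Ψ; Γ ⇒ (∃x:t. A)\𝒢

(x1, x2, ..., xn) = (y1, y2, ..., yn)θ
--------------------------------------------------------- (Initial rule)
Σ; Ψ0; Γ, p(x1, ..., xn) ⇒ p(y1, ..., yn)\∅

Fig. 2: Logic Inference Rules in Sequent Calculus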

The inference rules of the logic are presented in Figure 2, where φ represents an arbitrary
formula, Σ ⊢ var(C) means that the variables in formula C have the appropriate sorts (types) as
defined in the context Σ, and check, admit and ⊢ are constraint functions. Basically, check
verifies whether a constraint is admissible to an assumption constraint set Ψ or whether two
constraint sets Ψ1 and Ψ2 can be combined. Intuitively, a constraint C is admissible to Ψ if the
addition of C to Ψ will not cause any inconsistency between C and the existing constraints. admit
adds a new constraint to Ψ or 𝒢. Ψ ⊢ 𝒢 verifies whether all the constraints in 𝒢 can be solved
given the constraints in Ψ.

The proof process starts with the Constraint Solving rule. This rule states that in order to prove a
goal formula A without constraint verification (i.e., Σ; Ψ; Γ ⇒ A), we need to show: (1) A is provable
with the assumption constraints Ψ and the goal constraints 𝒢 (i.e., Σ; Ψ; Γ ⇒ A\𝒢), and (2) every
goal constraint C ∈ 𝒢 is solvable given the constraints in Ψ (i.e., Ψ ⊢ 𝒢). By solving constraint C, we
mean that the constraints in Ψ lead to a resolution that C is true. The Initial rule states that given the
initial known constraints, Ψ0, we can prove a formula p(y1, ..., yn) if p(x1, ..., xn) is assumed true and
(y1, ..., yn) is unified with (x1, ..., xn) through the most general unifier θ. There is no goal constraint for
the Initial rule (i.e., 𝒢 = ∅).

2.1.2  Constraint  Solving  and  Proof  Search 

To capture the complexity of a service migration and the properties of a system to support an assured
service migration, we propose to integrate the domain constraints with logic reasoning. This requires an
efficient constraint solver with properties such as consistency and completeness, and a decision procedure
capable of solving a set of constraints in an effective manner. The logic engine and the constraint
solver are separately designed but integrated through the applications of the logic inference rules.
This allows one to represent and reason about the important features of a service migration implemented by
different techniques. Separating the logic from the constraint domain makes the logical framework
more modular, scalable, and applicable to a wide range of applications.

The constraint functions verify whether some constraints can be satisfied (i.e., Ψ ⊢ 𝒢) or admitted
to a constraint set (e.g., check(Ψ, C), admit(Ψ, C), admit(𝒢, C)). Since constraint checking and
solving  are  driven  by  a  logic  engine  through  the  inference  rules  during  a  proof  process,  we  have 
developed  an  algorithm  to  explicitly  describe  the  interactions  between  the  constraint  manager  and  the 
logic  engine.  The  three  major  tasks  as  the  main  components  of  the  algorithm  are  discussed  next. 

Constraint Variable Unification. When a logic inference rule is applied, a logic variable may be
unified with a constant or another variable. Since logic variables may appear in the constraint terms
in Ψ and 𝒢, the unifier should be propagated to the constraint variables as well. The algorithm
maintains the most general unifier and applies it to each constraint term. A basic unifier is
determined when the Initial rule is applied, i.e., a logic goal formula (e.g., p(y1, ..., yn)) is unified with
a logic assumption (e.g., p(x1, ..., xn)) in Γ.
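Propagating a unifier into constraint terms amounts to applying a substitution recursively. The following sketch assumes a hypothetical term encoding that is not taken from the project: variables are strings beginning with "?", compound terms are tuples, and the unifier is an acyclic dictionary from variables to terms.

```python
def apply_unifier(term, theta):
    """Apply the substitution `theta` to a term.

    Compound terms are tuples and are rewritten component by component;
    a variable bound in `theta` is replaced by its (recursively rewritten) binding.
    `theta` is assumed to be acyclic, otherwise the recursion would not terminate.
    """
    if isinstance(term, tuple):
        return tuple(apply_unifier(part, theta) for part in term)
    if term in theta:
        return apply_unifier(theta[term], theta)
    return term


# A unifier produced by matching a goal against a hypothesis (illustrative values).
theta = {"?pk": "p2", "?target": "?pk"}

# A constraint term that mentions the logic variables.
constraint = (">=", ("SL", "sj", "?target"), "L")
grounded = apply_unifier(constraint, theta)
```

After the substitution the constraint refers to the concrete platform p2, so it can later be checked or solved against the assumption constraints.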

Constraint Checking and Admission. Constraint checking verifies whether a constraint C can be
admitted to the assumption constraint set Ψ (i.e., check(Ψ, C)) or whether two assumption constraint sets Ψ1
and Ψ2 can be combined without conflict (i.e., check(Ψ1, Ψ2)). Intuitively, check(Ψ, C) verifies whether
a prospective requirement (represented by C) for a system property or a time bound on an action can
be assumed in a logic reasoning process for an assured service migration given some existing
constraint assumptions. A new constraint C is admissible to Ψ = {C1, C2, ..., Cn} if there is no conflict
in assigning values to the free variables in C1, C2, ..., Cn given the quantitative constraints expressed
by C. If indeed there is no conflict, admit(Ψ, C) returns a new assumption constraint set with C added.
Otherwise, check(Ψ, C) returns false.
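To make the admission test concrete, the sketch below implements check and admit for a deliberately restricted constraint language: each constraint bounds a single variable, written as a (variable, operator, value) triple with operators "<=", ">=" and "==". This restriction and the encoding are assumptions made for illustration; the report does not spell out the constraint manager's actual algorithm here.

```python
import math


def bounds(constraints):
    """Tightest [low, high] interval per variable implied by single-variable constraints."""
    box = {}
    for var, op, value in constraints:
        low, high = box.get(var, (-math.inf, math.inf))
        if op == "<=":
            high = min(high, value)
        elif op == ">=":
            low = max(low, value)
        elif op == "==":
            low, high = max(low, value), min(high, value)
        else:
            raise ValueError(f"unsupported operator: {op}")
        box[var] = (low, high)
    return box


def check(psi, c):
    """True when adding c to the assumption set leaves every variable a feasible value."""
    return all(low <= high for low, high in bounds(psi + [c]).values())


def admit(psi, c):
    """Return a new assumption set with c added; reject inadmissible constraints."""
    if not check(psi, c):
        raise ValueError("constraint is not admissible")
    return psi + [c]


psi = [("Du(M)", "<=", 10)]
compatible = check(psi, ("Du(M)", ">=", 4))     # 4 <= Du(M) <= 10 is feasible
conflicting = check(psi, ("Du(M)", ">=", 12))   # no value satisfies both bounds
psi2 = admit(psi, ("Du(M)", ">=", 4))
```

A lower bound of 4 on the migration duration is admissible alongside an upper bound of 10, whereas a lower bound of 12 conflicts with it and is rejected.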

Constraint Solving. Constraint solving solves all the goal constraints 𝒢 given the assumption
constraints in Ψ (i.e., Ψ ⊢ 𝒢). By solving a constraint C ∈ 𝒢 (i.e., Ψ ⊢ C), we mean checking whether (1)
every variable in C has been resolved to a ground term; and (2) the constraint equation/inequality
relationship represented by C holds given the existing constraints in Ψ. A constraint manager reduces
constraint solving to a multi-criteria linear equations/inequalities checking problem. Constraint
solving is the last step in a proof process, performed once the constraint-unverified sub-sequent
(e.g., Σ; Ψ; Γ ⇒ A\𝒢) has been proved. It makes sure that all the goal constraints in 𝒢 can be
satisfied given the assumption constraints in Ψ.
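The two conditions for solving a goal constraint (groundness and validity of the relation) can be sketched as follows. The encoding is an assumption made for the example: the assumption constraints are taken to have already fixed a numeric value for each resolved term (the dictionary values), and each goal constraint compares the sum of some terms with a bound. The term names and numbers are illustrative only.

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq}


def solve(goals, values):
    """Return True when every goal constraint (terms, op, bound) is ground and holds."""
    for terms, op, bound in goals:
        if any(term not in values for term in terms):
            return False          # condition (1) fails: an unresolved term remains
        if not OPS[op](sum(values[term] for term in terms), bound):
            return False          # condition (2) fails: the relation does not hold
    return True


# Durations fixed by the assumption constraints (illustrative numbers).
values = {"Du(H,S)": 2, "Du(M)": 5, "Du(R)": 4}

# A goal constraint bounding the total time, in the spirit of an efficiency requirement.
within_12 = solve([(("Du(H,S)", "Du(M)", "Du(R)"), "<=", 12)], values)
within_10 = solve([(("Du(H,S)", "Du(M)", "Du(R)"), "<=", 10)], values)
unresolved = solve([(("Du(M)", "Du(X)"), "<=", 12)], values)
```

The total of 11 time units satisfies a bound of 12 but not a bound of 10, and a goal that mentions an unresolved term is reported as unsolved.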

A survivability policy specifies a user’s requirements for the survivability features of a system. In
our framework, such a policy is represented in terms of a set of temporal and functional properties
that the system must have in order to support an assured service migration. If a system possesses
those properties, it essentially guarantees that the critical services can be dynamically reallocated
from a compromised platform to other healthy ones. Therefore, the entire system can withstand
malicious attacks and continuously provide mission-critical services. If that is the case, we say the
system satisfies the user’s survivability policy. As a generic case, the service migration-based
survivability policy can be specified from three aspects as represented by the following logic goal
statements:

(1) G1 (Soundness): ... p) «c (He(p) ∧ SL(sj, p) ≥ L))

A service sj will eventually be completed on a healthy platform p with a service level of at least L,
either on its originally executed platform pi or on the migrated platform pk, as long as the platform is
damage-free (healthy).

(2) G2 (Efficiency): ... →o (H(sj) ‡ S(pi, sj) →o M(sj, pi, pk) →□ R(sj, pk, pi)) «c (Du(H(sj) ‡ S(pi, sj)) + Du(M(sj,
pi, pk)) + Du(R(sj, pk, pi)) ≤ Dmax)

The total time spent on service scheduling S(pi, sj), halting H(sj), migration M(sj, pi, pk), and relocation
R(sj, pk, pi) must not be more than the maximum allowable time Dmax.

(3) G3 (Integrity): ∀R(sj, pk, pi). (∃M(sj, pi, pk). (M(sj, pi, pk) →□ R(sj, pk, pi)))

For every relocation process R(sj, pk, pi), there must exist a migration process M(sj, pi, pk) that
occurred earlier.

To verify that a system satisfies the survivability policy as specified by the above three goal
statements, it is only necessary to find a proof for “Σ; Ψ; Γ ⇒ (G1 ∧ G2 ∧ G3)”.
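The integrity goal has a direct operational reading over a log of migration and relocation events. The sketch below checks that reading on a recorded trace rather than proving it in the logic; the trace format (time, kind, service, source, destination) and the function name integrity_holds are hypothetical.

```python
def integrity_holds(trace):
    """Check that every relocation is preceded by the matching migration.

    `trace` is a list of (time, kind, service, src, dst) tuples, where kind is
    "M" for a migration and "R" for a relocation. Events are processed in time
    order; at equal times a migration sorts before a relocation.
    """
    migrations = set()
    for _time, kind, service, src, dst in sorted(trace):
        if kind == "M":
            migrations.add((service, src, dst))
        elif kind == "R":
            # R(sj, pk, pi) requires an earlier M(sj, pi, pk).
            if (service, dst, src) not in migrations:
                return False
    return True


good_trace = [(0.0, "M", "s1", "p1", "p2"), (9.0, "R", "s1", "p2", "p1")]
bad_trace = [(9.0, "R", "s1", "p2", "p1")]
good = integrity_holds(good_trace)
bad = integrity_holds(bad_trace)
```

A relocation from p2 back to p1 is accepted only when the log already contains the migration from p1 to p2.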

A proof search identifies a derivation of a goal statement from a list of hypothetical
assumptions subject to a set of constraints by applying a set of inference rules. A proof is logically
viewed as a tree rooted at the conclusion sequent (e.g., Σ; Ψ; Γ ⇒ (G1 ∧ G2 ∧ G3) as shown above),
where the leaf sequents are all axioms and each non-leaf sequent is derived from its premise sequents
by a rule application. The proof search is syntax-driven, following the logic inference rules. Each
application of an inference rule reduces a sequent matching the conclusion of the rule to the premises
of the rule (i.e., sub-sequents). A branch of the proof is successfully terminated when the formula to
be proved unifies with a formula in the hypothesis set Γ. The resulting unifier is propagated to the
next remaining premise (including the constraint formulas in Ψ and 𝒢 as we discussed earlier) and
the process is repeated. The proof search obeys the following rules: (1) if every leaf node is an
instance of an axiom, i.e., Σ; Ψ0; Γ, p(x1, ..., xn) ⇒ p(y1, ..., yn), the proof search has terminated
successfully; (2) if some leaf contains no logic connectives, but is not an instance of any axiom, then
the search has terminated unsuccessfully; and (3) if a leaf contains some logic connectives, a search
step may choose one connective and apply the corresponding inference rule to reduce the proof of the
conclusion to its premises.
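The syntax-driven character of the search can be illustrated on a much smaller fragment than the full logic. The sketch below is a simplification under stated assumptions: formulas are ground (no unification), constraints are ignored, only conjunction and plain implication are handled, and implications in the hypotheses are used by backward chaining on their conclusions. It is sound for this fragment but not complete, and it is not the project's proof engine.

```python
def prove(goal, hyps, seen=frozenset()):
    """Goal-directed proof search over ground formulas.

    Atoms are strings; ("and", A, B) is a conjunction; ("->", A, B) is an implication.
    `hyps` is a frozenset of hypotheses; `seen` blocks cyclic backward chaining.
    """
    if isinstance(goal, tuple) and goal[0] == "and":
        # Conjunction on the right: prove both conjuncts.
        return prove(goal[1], hyps, seen) and prove(goal[2], hyps, seen)
    if isinstance(goal, tuple) and goal[0] == "->":
        # Implication on the right: assume the antecedent, prove the consequent.
        return prove(goal[2], hyps | {goal[1]}, seen)
    if goal in hyps:
        return True                      # the leaf matches a hypothesis (axiom instance)
    if goal in seen:
        return False                     # already trying to prove this atom
    for hyp in hyps:
        # Implication on the left: reduce the goal to the antecedent of a matching rule.
        if isinstance(hyp, tuple) and hyp[0] == "->" and hyp[2] == goal:
            if prove(hyp[1], hyps, seen | {goal}):
                return True
    return False


# Known fact DD plus two transition-style hypotheses (names are illustrative).
hyps = frozenset({"DD", ("->", "DD", "H_S"), ("->", "H_S", "M")})
migration_provable = prove("M", hyps)
both_provable = prove(("and", "M", "H_S"), hyps)
relocation_provable = prove("R", hyps)
```

Starting from the damage-detection fact, the search derives halting/scheduling and then migration, while a goal with no supporting hypothesis terminates unsuccessfully.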

2.2 Service Migration Modeling and Critical Factors that Affect an Effective Service Migration

To  quantify  the  important  factors  that  affect  an  effective  and  efficient  service  migration/relocation, 
we  have  developed  a  simulation  model  to  represent  the  activities  and  behaviors  of  system 
components  in  a  service  migration/relocation  process.  The  model  is  encoded  in  the  Performance 
Evaluation  Process  Algebra  (PEPA).  PEPA  introduces  delays  and  probabilistic  occurrences  to 
process  algebras.  The  timing  behavior  of  a  system  is  quantified  by  associating  a  random  variable 
with each activity, representing its duration. Behavior uncertainty is captured by probabilistic
branching: the probabilities of the occurrence of some activities are determined by a race condition
between the enabled activities. In our model, the service migration and relocation activities are
represented as stochastic actions that are non-deterministic and whose occurrence or non-occurrence
is governed by one or more random variables (i.e., activity rates). A system SYS is modeled as
interactions between the service migration/relocation components (i.e., a migration scheduler, a
platform to support a critical service, and a relocation manager) and the damage recovery
components (i.e., a fault diagnosing agent and a damage repairer).
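Because PEPA activity durations are exponentially distributed, the race condition has a simple closed form: an enabled activity with rate r wins the race with probability r divided by the sum of the rates of all enabled activities, and the mean time until some activity completes is the reciprocal of that sum. The sketch below computes both quantities; the activity names and rates are illustrative values, not the rates used in the project.

```python
def race_probabilities(rates):
    """Probability that each enabled activity completes first, given exponential rates."""
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


def mean_sojourn_time(rates):
    """Expected time until the first of the enabled activities completes."""
    return 1.0 / sum(rates.values())


# Two competing outcomes of a scheduling step (rates are made up for the example).
enabled = {"schedule_ok": 3.0, "schedule_failure": 1.0}
probabilities = race_probabilities(enabled)
sojourn = mean_sojourn_time(enabled)
```

With rates 3.0 and 1.0, the successful outcome wins the race three times out of four and the state is left after 0.25 time units on average.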

The PEPA model representing SYS is shown in Figure 3. The model has 11
processes (components) representing a complete procedure for service sj to be migrated from a
compromised platform pi to a new platform pk and finally relocated from pk to pi after pi is recovered.
The model also includes the recovery procedure of the compromised component.


Execution_ij        = (monitor_anomaly, p1).(alarm, al).Contingency_i + (monitor_normal, p2).(executing_ij, f).Execution_ij;

Contingency_i       = (investigate_damaged, (it * p3)).(recovery_notify_i, tt).Migration_Manager +
                      (investigate_self_contain, (it * p4)).Execution_ij;

Migration_Manager   = (halting_ij, h).Migration_Scheduler;

Migration_Scheduler = (schedule_ok, (st * p5)).Migration_ik + (schedule_failure, (st * p6)).Migration_Scheduler;

Migration_ik        = (m_Pre, m1).(m_Tr, m2).(m_Su, m3).Execution_kj;

Execution_kj        = (recovery_check_ok, p7).(recovered_i, ⊤).Relocation_Manager +
                      (recovery_check_pending, p8).(executing_ij, f).Execution_kj;

Relocation_Manager  = (halting_ij, h).Relocation_ki;

Relocation_ki       = (r_Pre, r1).(r_Tr, r2).(r_Su, r3).Execution_ij;

Recovery_Manager    = (recovery_notify_i, ⊤).Recovery_i;

Recovery_i          = (diagnose, dt).Repairer;

Repairer            = (repair_success, (l * p9)).(recovered_i, rp).Recovery_Manager + (repair_fail, (l * p10)).Recovery_i;

SYS                 = Execution_ij ⋈L Recovery_Manager, where L = {recovery_notify_i, recovered_i}

Fig.  3:  PEPA  Model  of  Service  Migration  and  Relocation 

The  PEPA  model  has  been  solved  using  the  PEPA  Eclipse  Plug-in  software.  We  have  developed 
a  Bayesian  network  decision  model  to  determine  the  activity  rates  used  in  the  model.  Several  rounds 
of  simulations  were  conducted  for  steady-state,  utilization,  passage-time,  throughput,  and 
experimentation  analysis,  in  order  to  study  how  important  factors  influence  the  effectiveness  and 
efficiency  of  a  service  migration/relocation  process.  Those  analyses  are  summarized  below. 

Steady-state Probabilities, Local State Utilization, and Activity Throughput. For a PEPA
simulation execution, the system states of the underlying continuous-time Markov
process are derived and the probability of the system being in each state is generated. Our PEPA model
has 33 (global) states. Since the model has two top-level PEPA processes, Execution_ij and
Recovery_Manager, a (global) state has two elements, one local state from each of the
top-level processes. Our simulation shows that Execution_ij has 17 local states and
Recovery_Manager has 4. The PEPA states with dominating steady-state probabilities are those
associated with the two local states executing_ij.Execution_ij (0.891) and Recovery_Manager (0.981).
This has also been observed in our utilization analysis, which shows the long-run utilization of
each top-level process of the PEPA model. Since executing_ij.Execution_ij represents the normal
execution of service s_j on its original platform p_i, maximizing the utilization of this state is the
objective of an efficient service migration and relocation. We use Pro to represent this utilization rate.
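
As an illustration of what the steady-state analysis computes, the following sketch solves the global balance equations (pi * Q = 0 with the probabilities summing to 1) for a small continuous-time Markov chain and derives one activity throughput from the result. The three states and their rates are hypothetical and are not taken from the model in Fig. 3; the PEPA Eclipse Plug-in performs the corresponding computation on the full state space.

```python
def steady_state(Q):
    """Solve pi * Q = 0 with sum(pi) = 1 for a CTMC generator matrix Q."""
    n = len(Q)
    # Transpose Q so that the unknowns form a column vector: Q^T * pi^T = 0.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    b = [0.0] * n
    # Replace the last (redundant) balance equation with the normalization.
    A[n - 1] = [1.0] * n
    b[n - 1] = 1.0
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * pi[c] for c in range(r + 1, n))
        pi[r] = (b[r] - s) / A[r][r]
    return pi


# Hypothetical chain: 0 = executing on the original platform,
# 1 = migrating, 2 = executing on the backup platform.
anomaly_rate, migrate_rate, relocate_rate = 0.1, 2.0, 0.5
Q = [
    [-anomaly_rate, anomaly_rate, 0.0],
    [0.0, -migrate_rate, migrate_rate],
    [relocate_rate, 0.0, -relocate_rate],
]
pi = steady_state(Q)

# Throughput of an activity = steady-state probability of the state that
# enables it multiplied by the activity rate.
anomaly_throughput = pi[0] * anomaly_rate
```

In this toy chain, pi[0] plays the role of the utilization Pro: lowering the anomaly rate or raising the relocation rate increases the share of time spent in normal execution.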

Activity throughput. A throughput analysis lists the rate at which the activities of the PEPA model are
performed at steady state. The two PEPA activities with the highest throughputs are executing_ij
(0.09) and monitor_normal (0.09). The former indicates that service s_j is executing on platform p_i,
and hence a higher value is more desirable. The latter indicates that the intrusion detection system
reports normal system operation in most cases (i.e., no suspicious behaviors are detected). Just as
the steady-state analysis focuses on the local state executing_ij.Execution_ij, the throughput of
executing_ij represents the desired behavior of s_j on p_i; it is therefore another metric of
interest in this research.

Experimentations. We ran the PEPA model with its parameters varied across the desired ranges.
The experiments study how system security and functional factors affect the utilization rate of
executing_ij.Execution_ij (i.e., Pro) and the throughput of executing_ij, as mentioned above. We start
with the two factors determined by the security features of the system SYS: (1) the probability of
anomaly detection on a platform p_i in SYS in an intrusion-detection report cycle, i.e., Pro_1; and (2)
the probability that the detected damage is severe, i.e., Pro_2. Intuitively, if a system has strong
security mechanisms and a high level of capability to contain and mask potential damage, then the
need for the critical services to be migrated from their normal executing platforms would be low.
Hence, the utilization of executing_ij.Execution_ij should be higher. The experiments confirm this
observation, i.e., Pro decreases when Pro_1 and Pro_2 increase. This clearly indicates that a higher
compromise rate on a platform decreases the amount of time that the platform effectively supports
the critical services. Furthermore, we have identified that the quantitative relationship between Pro
and Pro_1 is roughly linear given a fixed Pro_2. This implies that a significant improvement of the
system’s security will result in an almost equal increase in the normal execution of critical services
on their original platforms. A similar pattern can be observed for the throughput of executing_ij given
different Pro_1 and Pro_2 values.

As  we  have  discussed  earlier,  the  Migration  Manager  is  responsible  for  halting  a  critical  service 
on  a  compromised  platform,  scheduling  and  arranging  a  new  platform  for  the  service  to  be  migrated 
to,  moving  the  data  and  program  space  of  the  service  to  the  new  platform,  and  finally  setting  up  the 
service  on  the  new  platform.  In  the  meantime,  the  system  component  Recovery  Manager  diagnoses 
the  faults  and  attempts  to  repair  the  compromised  platform.  The  performance  of  those  two  system 
components affects a service migration. Our simulation shows that a higher probability of
successful migration scheduling and a higher probability of successful repair of a
compromised platform both positively affect the utilization of executing_ij.Execution_ij, i.e., Pro.
This indicates that effective damage recovery and highly available healthy platforms increase the
overall efficiency of a service migration, which in turn increases the percentage of time that the
critical services are executed on their normal platforms. However, Pro becomes stable once the
probability of a new platform being identified at the time of migration scheduling reaches a certain
value (0.1 in our simulation). Any further improvement of migration scheduling
beyond this point no longer significantly improves Pro. Therefore, migration scheduling is
not a significant bottleneck for executing_ij.Execution_ij beyond that point.

2.3 Semantic Specification of Service Migration

We  have  developed  a  semantic  model  to  formalize  a  service  migration,  its  main  constructs,  and  the 
constraint  rules  on  the  operations  and  interactions  of  the  service  migration  activities.  We  defined  the 
basic  notations  that  describe  the  core  constructs  of  a  service  migration,  which  form  the  baseline  to 
specify  the  semantic  constraints  on  the  activities  of  a  service  migration.  The  service  migration 
constraints  specify  what  system  activities  are  valid  and  what  properties  must  hold  during  and  after  a 
service  migration.  The  constraints  are  defined  in  terms  of  (a)  the  inherent  relationships  and 
interactions  among  system/service  activities  (e.g.,  service  dependency,  resource  provision  and 
requirement)  and  (b)  functional  and  policy  regulations  on  service  migration  activities  (such  as  service 
prioritization  and  platform  restrictions).  The  model  serves  as  a  foundation  for  users  to  specify  and 
validate  the  integrity  and  important  properties  of  a  service  migration.  Some  of  the  basic  service 
migration constraint rules are listed in Table 1.

Table 1: Basic Service Migration Constraint Rules


| Constraint Rule | Representation | Explanation |
|---|---|---|
| Destination uniqueness | $\forall s_i \in S,\ \forall p_j \in P,\ \forall p_l \in P\ (SP(s_i, p_j) \wedge SP(s_i, p_l) \rightarrow \bot)\ (j \neq l)$ | Every service $s_i$ will eventually be migrated to no more than one platform $p_j$ or $p_l$ |
| Service migration prohibition | $\forall s_i \in S,\ \forall p_j \in P\ (Prh(s_i, p_j) \rightarrow \neg SP(s_i, p_j))$ | A service is specified to be prohibited from executing on a platform due to corporate policy (e.g., security regulations), resource restriction on the platform, service prioritization, and/or technical specifications |
| Service dependency constraint | $\forall s_i \in S,\ \forall S' \in 2^S\ (Dep(s_i, S') \rightarrow (\exists s_k \in S',\ \exists p_j \in P\ (SP(s_k, p_j) \wedge SP(s_i, p_j))))$ | If a service $s_i \in S$ is dependent on any one service in a group of services, then $s_i$ must be migrated to the same platform $p_j$ as the particular service on which it depends |
| Service atomic constraint | $\forall S' \in 2^S\ (Atom(S') \rightarrow (\forall s_k \in S',\ \forall s_i \in S',\ \exists p_j \in P\ (SP(s_i, p_j) \wedge SP(s_k, p_j))))$ | If a set of services are mutually dependent on each other (i.e., atomic), then all of them should be migrated to the same platform |
| Service exclusion constraint | $\forall s_i \in S,\ \forall S' \in 2^S\ (Exc(s_i, S') \rightarrow (\forall s_k \in S',\ \exists p_j \in P\ ((SP(s_k, p_j) \wedge SP(s_i, p_j)) \rightarrow \bot)))$ | If a service $s_i \in S$ is exclusive to every service in a group of services, then $s_i$ must not be executed on the same platform as any one of those services. One service is exclusive to another if they have functional/resource conflicts |
| Resource provision and requirement constraint | $\forall s_i \in S,\ \forall p_j \in P\ (SP(s_i, p_j) \rightarrow (\forall r_a \in Res\ (RP(p_j, r_a) \geq RR(s_i, r_a))))$ | If a service $s_i \in S$ is migrated to a platform, the available resource $r_a$ provided by the platform must be no less than that required by $s_i$ for $r_a$ |
| Platform health constraint | $\forall s_i \in S,\ \forall p_j \in P\ (SP(s_i, p_j) \rightarrow TH(p_j))$ | Each service must be arranged to be migrated to a healthy platform |
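
To make the role of these rules concrete, the following minimal sketch checks a candidate arrangement against three of the rules in Table 1 (prohibition, platform health, and resource provision and requirement). The function name, the data layout, and the sample services and platforms are hypothetical and chosen only for illustration. Representing the arrangement as a mapping from each service to a single platform satisfies destination uniqueness by construction.

```python
def violations(assignment, prohibited, healthy, provided, required):
    """List the rule violations of a service -> platform assignment."""
    found = []
    for service, platform in assignment.items():
        # Service migration prohibition: Prh(s, p) forbids SP(s, p).
        if (service, platform) in prohibited:
            found.append(("prohibition", service, platform))
        # Platform health: the destination must be a healthy platform.
        if platform not in healthy:
            found.append(("health", service, platform))
        # Resource rule: for every resource the service requires, the
        # platform must provide no less than the required amount.
        for resource, amount in required[service].items():
            if provided[platform].get(resource, 0) < amount:
                found.append(("resource", service, platform))
    return found


# Hypothetical example data.
assignment = {"s1": "p2", "s2": "p3"}
prohibited = {("s2", "p3")}
healthy = {"p2", "p3"}
provided = {"p2": {"cpu": 4, "mem": 8}, "p3": {"cpu": 2, "mem": 4}}
required = {"s1": {"cpu": 2, "mem": 16}, "s2": {"cpu": 1}}

result = violations(assignment, prohibited, healthy, provided, required)
```

Here s1 asks for more memory than p2 provides and s2 is prohibited on p3, so both placements are reported. As in the table, the resource rule is checked per service rather than summed over all services placed on the same platform.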

Based on the semantic model, we have proposed an approach to identify an optimal service
migration arrangement with the lowest cost to migrate each service to a platform without violating
any of the semantic constraints. Essentially, identifying an optimal service migration arrangement is
reduced to a Pseudo-Boolean Optimization (PBO) problem, which is an extended SAT problem. The
idea is to formulate and minimize an objective function $\sum_{i}\sum_{j} x_{i,j} \cdot CT_{i,j}$, subject to a set of
pseudo-Boolean constraints, where $x_{i,j} \in \{0, 1\}$ represents whether service $s_i$ is arranged to be
migrated to platform $p_j$ and $CT_{i,j} \in \mathbb{Z}$ represents the cost to move $s_i$ to $p_j$. In a PBO problem, each
pseudo-Boolean constraint is either in a pure Boolean Conjunctive Normal Form or in the form

$$\sum_{j=1}^{m} a_{i,j} \cdot x_j \geq b_i, \quad \text{where } a_{i,j},\ b_i \text{ and } x_j \in \{0, 1\}.$$
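
As an illustration of how rules such as those in Table 1 can be expressed in pseudo-Boolean form, one possible encoding is sketched below. It is an assumption made for exposition; the constraint set actually produced by our generation algorithm is not listed in this report. The variable $x_{i,j}$ is 1 when service $s_i$ is placed on platform $p_j$.

```latex
\begin{align*}
\text{minimize}\quad & \sum_{i}\sum_{j} x_{i,j}\, CT_{i,j} \\
\text{subject to}\quad
& \sum_{j} x_{i,j} = 1
  && \text{for every service } s_i \text{ (exactly one destination)} \\
& x_{i,j} = 0
  && \text{whenever } Prh(s_i, p_j) \text{ holds or } p_j \text{ is not healthy} \\
& x_{i,j} + x_{k,j} \le 1
  && \text{for every } p_j \text{, when } s_i \text{ is exclusive to } s_k \\
& x_{i,j} - x_{k,j} = 0
  && \text{for every } p_j \text{, when } s_i \text{ and } s_k \text{ are atomic}
\end{align*}
```

An equality can be split into a pair of inequalities, and a constraint of the form $x + y \le 1$ is equivalent to $\bar{x} + \bar{y} \ge 1$ with $\bar{x} = 1 - x$, so each line can be brought into the $\ge$ form given above using negated literals.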

We  have  developed  an  algorithm  to  automatically  generate  the  set  of  pseudo-Boolean  constraints 
given the service migration constraints in our semantic model. A set of simulations has been
conducted to evaluate the feasibility and efficiency of using the proposed PBO approach to identify
an optimal service migration arrangement without violating any of the service migration constraint
rules. We used Sat4j-pb (www.sat4j.org) in our simulations, which is an open-source Java library
and the winning program of the Pseudo Boolean Competition 2012. The simulations demonstrated the
feasibility of enforcing the semantic constraints in identifying an optimal service migration
arrangement  using  the  PBO  approach.  For  performance  evaluation,  we  studied  the  impact  of  the 
number  of  services  to  be  migrated  and  various  constraint  rules  on  the  execution  time  required  to 
identify  an  optimal  service  migration  arrangement. 

2.4 A Holistic Approach for Service Migration Decision Making

As  a  systematic  defensive  security  approach,  service  migration  is  a  system-wide  process  and  involves 
multiple components of a system. As the complexity of systems and of attack techniques
continuously grows, a well-planned and assured service migration is necessary in order to minimize
any damage resulting from malicious attacks. We have developed a holistic approach for service
migration decisions, which manage and guide the activities and procedures of a service migration
process.  Our  approach  includes  three  major  decision  components  -  (1)  determining  whether  a  service 
migration  is  the  most  appropriate  course  of  action  to  take  in  case  of  a  malicious  attack;  (2)  deciding 
the  best  strategy  for  a  service  migration;  and  (3)  developing  an  effective  and  efficient  schedule  for 
the  service  migration  activities. 

2.4.1  Belief-based  Decision  Making  for  Service  Migration  Determination 

The  first  fundamental  decision  for  a  service  migration  is  to  determine  whether  a  service  migration  is 
the  most  appropriate  course  of  action  to  take  in  a  security  incident.  This  decision  is  made  based  on  the 
devastating  nature  of  the  attack,  the  damage  already  caused  by  the  attack,  and  system  resources 
available to recover from and defend against the attack. Service migration is most appropriate when the
effect of the attack is so severe that it is difficult to recover the damaged platforms quickly enough to
keep the services continuously available to users without a noticeable interruption. In this case, the
best  strategy  is  to  migrate  the  services  from  their  compromised  platforms  to  other  clean,  healthy 
platforms  so  that  those  services  can  be  continuously  executed  on  those  new  platforms.  In  this  way,  any 
critical  services  will  still  be  available  to  users  even  when  some  platforms  of  the  system  have  been 
compromised. 


[Figure 4 diagram: belief combination from multiple intrusion detection agents about a platform of
concern; probability distribution on all combinations of the possible damage states of the platform.]

Fig. 4: Belief-based service migration decision model


Making  a  service  migration  decision  must  balance  between  the  cost  of  service  migration  (e.g., 
suspending  current  running  processes,  transferring  the  data  and  service  programs  to  new  platforms, 
and  setting  up  the  services  on  the  new  platforms)  and  the  necessity  of  migrating  services  somewhere 
else to avoid further losses (e.g., any direct and indirect costs resulting from the compromised
platforms). A fundamental criterion for such a decision is whether the platforms of
concern have been severely damaged. Assessing the damage status of a platform is not a trivial task,
because malicious attacks have become increasingly complicated and the system
resources available for damage assessment and recovery are often limited in a security incident
scenario. Our approach for damage assessment of a platform is to integrate multiple sources of
damage  assessment  results  from  several  independent  intrusion  detection  agents.  We  have  developed  a 
transferable  belief-based  decision  model  to  represent  the  damage  assessment  about  a  platform  as 
provided  by  an  intrusion  detection  agent  and  to  combine  multiple  sources  of  assessments  into  an 
integrated,  more  reliable  damage  assessment  result  about  that  platform.  As  shown  in  Figure  4, 
damage  assessment  about  a  platform  by  each  intrusion  detection  agent  is  represented  as  a  basic  belief 
assignment,  i.e.,  a  belief  mass  function  on  the  subsets  of  a  belief  domain.  Belief  combination  rules 
are  applied  to  integrate  multiple  sources  of  beliefs  to  reach  a  comprehensive  belief  assignment  which 
represents  the  final  damage  assessment  of  that  platform.  The  combined  belief  assignment  represents 
a probability distribution on all the combinations of the possible damage states of the platform. Given
the cost of performing a security action (e.g., service migration, system repair and restoration, and
system mending and refurbishment) on each damage state of the platform, a Bayesian decision model
is developed to determine whether a service migration is the most effective and cost-efficient action
to take. If service migration has the lowest overall cost among the candidate security actions, it is
judged the most appropriate action to take.
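
The decision chain described above (belief assignments, combination, probability distribution, cost-based choice) can be sketched as follows. This is an illustration under stated assumptions, not the project's model: it uses Dempster's normalized rule of combination and the pignistic transformation as one common way of instantiating the steps, and the damage states, belief masses, and action costs are hypothetical.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: conjunctive combination followed by normalization."""
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + x * y
        else:
            conflict += x * y  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


def pignistic(m):
    """Spread the mass of each focal set evenly over its elements."""
    prob = {}
    for focal, mass in m.items():
        for state in focal:
            prob[state] = prob.get(state, 0.0) + mass / len(focal)
    return prob


def best_action(prob, cost):
    """Return the action with the lowest expected cost, and all costs."""
    expected = {
        action: sum(prob.get(state, 0.0) * c for state, c in by_state.items())
        for action, by_state in cost.items()
    }
    return min(expected, key=expected.get), expected


# Hypothetical basic belief assignments from two detection agents.
SEVERE, EITHER = frozenset({"severe"}), frozenset({"minor", "severe"})
agent1 = {SEVERE: 0.6, EITHER: 0.4}
agent2 = {SEVERE: 0.5, frozenset({"minor"}): 0.2, EITHER: 0.3}

combined = combine(agent1, agent2)
prob = pignistic(combined)

# Hypothetical cost of each action in each damage state.
cost = {
    "migrate": {"minor": 40, "severe": 50},
    "repair": {"minor": 10, "severe": 120},
}
action, expected = best_action(prob, cost)
```

With these numbers the combined evidence points strongly to severe damage, so migration has the lower expected cost even though repair is cheaper when the damage is minor.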

2.4.2  Fuzzy  Inference  for  Service  Migration  Strategy 

Once  a  decision  for  service  migration  is  made,  the  next  step  is  to  determine  which  strategy  to  use  for 
the  service  migration.  In  our  discussion,  a  service  migration  strategy  is  a  specification  about  whether 
the  service  programs,  the  service  state  and  the  data  space  need  to  move  entirely  or  partially  given 
different  security  and  system  situations.  Such  a  strategy  provides  a  high-level  guideline  for  the 
underlying  service  migration  activities  and  procedures  to  carry  out. 


[Figure 5 diagram: Service Migration Strategy Fuzzy Inference System, consisting of a Knowledge Base,
a Meta Database, a Logic Inference Engine, and Fuzzification & Defuzzification Interfaces.]

Fig. 5: Components of the Fuzzy Inference System for Determining a Service Migration Strategy


A  service  migration  strategy  is  determined  based  on  the  damage  degree  of  the  service  programs, 
the  complexity  of  the  service  programs,  and  the  availability  of  network  capacity  to  securely  transfer 
service  programs  and  data  to  their  new  platforms.  In  one  situation,  if  the  service  programs  have  been 
severely  damaged,  they  cannot  be  executed  on  the  new  platforms  and  therefore  should  not  be  moved. 
Rather,  functionally  equivalent  programs  must  be  generated  on  the  new  platforms  in  order  to 
continuously execute the services. From a security perspective, the newly generated service programs
must be resistant to the same type of attack that occurred on the compromised platform. In another
situation, if the service programs are only slightly damaged or even damage-free, they can
be readily used on the new platform and hence can be migrated entirely. Regardless of whether the
service programs are moved or not, the service state and the data space must be saved and moved to
the new platform in order for the services to be resumed from where they left off on their original
platforms. As a general guideline to determine a service migration strategy, only a minimum amount
of  data  and  programs  should  be  migrated  whenever  possible.  We  have  identified  the  following  three 
service  migration  strategies: 

•  Heavyweight  migration  -  moving  the  entire  service  programs,  the  service  state,  and  the  data  space 
from  their  current  platform  to  a  new  platform; 

•  Lightweight  migration  -  only  relocating  the  service  state  and  the  data  space  but  not  the  service 
programs.  Since  the  service  programs  are  not  moved,  the  system  must  re-generate  the  service 
code  on  the  new  platforms  so  that  the  service  can  be  continuously  provided  on  the  new  platforms; 

•  Middleweight  migration  -  moving  part  of  the  service  programs,  along  with  the  service  state  and
the  data  space  to  the  new  platforms  and  generating  the  remaining  unmoved  service  program 
components  on  the  new  platforms  in  order  to  execute  the  service  programs. 

We have developed a fuzzy inference system to determine a service migration strategy. Our
approach uses expert knowledge as linguistic reasoning rules and takes the service program damage
assessment, service program complexity, and available network capability as input. The fuzzy
inference system includes four components, as shown in Figure 5: (1) a knowledge base containing a
set of fuzzy rules that represent domain expert knowledge about the implications of conditions for a
service migration strategy. Each rule is represented in linguistic fuzzy terms in the form of an If-Then
statement, indicating the assumptions and the consequence of a logic implication; (2) a meta database
containing fuzzy variables, fuzzy terms, and the membership functions of the fuzzy terms; (3) a logic
inference engine for fuzzy logic reasoning that takes the crisp values of the input fuzzy variables and the
fuzzy rules. In a logic reasoning process, there is no fixed order in which the rules are applied. The
logic engine evaluates the values of the input fuzzy variables and determines the rules whose
conditions those input values match. All the matching rules are applied regardless of the order in which
they appear in the rule base. Methods for condition aggregation, fuzzy rule activation, and multi-rule
result accumulation are also defined in this component for inference reasoning; and (4)
fuzzification and defuzzification user interfaces for input and output values. Fuzzification
determines the mapping of each input crisp value to the linguistic terms of the fuzzy variable
taking this value. Defuzzification converts the fuzzy inference result set to a crisp value for each
output variable.
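
The four components can be illustrated with a minimal Mamdani-style sketch. Everything specific in it is a hypothetical stand-in: the 0 to 10 scales, the triangular membership functions, the four rules, and the choice of min for condition aggregation, max for accumulation, and centroid defuzzification are assumptions for illustration and are not the rule base or methods used in our simulation.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Meta database (hypothetical): fuzzy terms on a 0..10 scale.
IN_TERMS = {"low": (-5, 0, 5), "medium": (0, 5, 10), "high": (5, 10, 15)}
# A low output score means that little program code is moved.
OUT_TERMS = {"lightweight": (-5, 0, 5), "middleweight": (0, 5, 10),
             "heavyweight": (5, 10, 15)}
# Knowledge base (hypothetical): conditions joined by AND, then a strategy.
RULES = [
    ({"damage": "high"}, "lightweight"),
    ({"damage": "medium"}, "middleweight"),
    ({"damage": "low", "network": "high"}, "heavyweight"),
    ({"damage": "low", "network": "low", "complexity": "high"}, "middleweight"),
]


def infer(inputs):
    """Return (crisp score, activation per strategy) for crisp inputs."""
    activation = {term: 0.0 for term in OUT_TERMS}
    for conditions, strategy in RULES:
        # Fuzzification and condition aggregation: AND is taken as min.
        strength = min(tri(inputs[var], *IN_TERMS[term])
                       for var, term in conditions.items())
        # Accumulation over rules with the same consequent: max.
        activation[strategy] = max(activation[strategy], strength)
    # Defuzzification: centroid of the clipped output sets on a 0..10 grid.
    num = den = 0.0
    for step in range(101):
        y = step / 10
        mu = max(min(activation[t], tri(y, *OUT_TERMS[t])) for t in OUT_TERMS)
        num += y * mu
        den += mu
    score = num / den if den > 0 else None  # None when no rule fires
    return score, activation


score, activation = infer({"damage": 9, "complexity": 5, "network": 5})
```

Every rule is evaluated on every call, so the order of RULES does not affect the result, which mirrors the order-independence of rule application noted above. For the sample input, heavy damage activates the lightweight strategy most strongly.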

We  use  jFuzzyLogic  to  simulate  the  fuzzy  inference  system  for  service  migration  strategy. 
jFuzzyLogic is a Java fuzzy logic software package that implements a
complete fuzzy inference system as well as fuzzy control logic compliant with IEC 61131-7
(formerly 1131-7). The definitions of the fuzzy variables of our service migration inference system
are encoded in Fuzzy Control Language (FCL). Preliminary results show that the proposed fuzzy
inference  system  is  effective  in  determining  the  most  appropriate  strategy  for  service  migration  given 
a  security  incident  scenario. 

2.4.3  A  Logic  Approach  for  Service  Migration  Scheduling 

One  of  the  important  activities  of  a  service  migration  is  service  scheduling,  which  generates  an 
effective  arrangement  for  each  service  with  a  high  level  of  priority  to  migrate  from  a  compromised 
platform  to  another  healthy  one.  Before  any  resources  are  allocated  for  service  migration  and  any 
service  migration  activities  can  start,  there  must  be  an  efficient  scheduling  to  determine  which 
service  to  be  migrated  to  which  platform. 

We  have  proposed  a  logic  approach  for  service  migration  scheduling.  The  logic  constructs  and 
inference  rules  have  been  developed.  Using  a  logic  approach  makes  it  flexible  to  incorporate  new 
constraint  rules  and  also  provides  a  formal  method  to  analyze,  evaluate  and  verify  the  correctness  of 
the approach. Given the limited resources available in a security incident and the high requirement
for continuous functioning of the services, service migration scheduling must be completed in a timely
manner; at the same time, the service migration arrangement is subject to various constraints: (1) any new
platform  to  host  the  migrating  services  must  possess  the  required  capabilities  and  resources  to 
support  the  functionality  of  those  services;  and  (2)  the  inherent  relationships  among  the  migrating 
services  (e.g.,  dependency  or  exclusion)  on  their  original  platforms  must  be  maintained  on  the  new 
platforms.  In  our  approach,  those  requirements  are  represented  as  a  set  of  constraints  in  a  logic 
reasoning  process  in  order  to  generate  a  valid  service  migration  schedule.  The  interplay  between  the 
constraint  domain  and  the  logic  reasoning  is  implemented  through  a  set  of  inference  rules. 

In  the  implementation,  the  service  migration  constraints  are  enforced  by  a  set  of  general  but
expressive  rules.  Proof  obligations  and  proof  restrictions  are  generated  when  the  corresponding 
constraint  rules  are  applied.  To  show  that  the  logic  reasoning  will  not  violate  any  of  the  constraint 
rules,  a  scheduled  service  must  not  be  prohibited  as  indicated  by  a  proof  restriction.  In  the  meantime, 
proof  obligations  are  generated  as  a  result  of  applying  some  constraint  rules.  Those  proof  obligations 
must  be  satisfied  by  some  proof  elements  in  order  for  the  entire  scheduling  process  to  be  successful. 
A  proof  solution  represents  a  feasible  scheduling  of  a  service  on  a  particular  platform,  which  can  be 
used  to  discharge  some  proof  obligations.  We  defined  a  self-contained  data  structure  to  specify  a  set 
of  constraints,  to  create  proof  obligations  and  proof  restrictions  when  certain  constraint  rules  are 
applicable,  to  verify  that  a  reasoning  step  does  not  violate  the  proof  restrictions,  and  to  generate  proof 
solutions  to  solve  those  proof  obligations. 

To  validate  the  proposed  logic  approach  for  service  migration  scheduling,  we  have  developed  a 
proof-of-concept  logic  program  to  automatically  schedule  a  set  of  services  to  be  migrated  to  a  set  of 
platforms  subject  to  a  set  of  constraint  rules.  The  logic  program  takes  a  set  of  services  and  platforms 
as  input  and  generates  an  arrangement  for  each  service  to  be  migrated  to  a  platform,  or  reports  a 
failure  if  no  such  arrangement  exists  or  some  constraints  cannot  be  satisfied.  The  program  was 
implemented in JProlog. We ran a set of simulations to verify the correctness and efficiency of the
program.  All  the  inference  rules  have  been  validated.  The  results  show  that  the  logic  program  has 
successfully  identified  all  the  valid  arrangements  to  migrate  the  services  to  available  platforms  given 
a  set  of  constraints. 

2.4.4  Mobile  Agent-based  Service  Migration  Simulation 

Mobile  agents  are  special  software  agents  that  move  spontaneously  across  multiple  hosts  of  one  or 
more  networks.  In  case  of  malicious  attacks,  mobile  agents  can  move  from  their  damaged  platforms 
to  other  clean,  healthy  platforms  so  that  the  services  they  offer  can  be  continuously  provided  on  the 
new  platforms,  thus  achieving  service  migration.  Service  migration  through  such  strategic  agent 
movement  helps  a  system  survive  host  damage  and  improves  service  availability.  We  propose  a 
mobile  agent-based  approach  for  service  migration,  where  a  group  of  agents  collaboratively  decide  a 
migration  plan  to  relocate  from  their  current  platforms  to  other  more  secure  and  reliable  platforms. 

We  specify  the  system  architecture  to  support  agent  migration  and  propose  a  collaborative 
decision  making  model  for  a  group  of  agents  to  decide  their  destination  platforms  in  a  migration 
process.  Since  agents  are  social  entities,  they  collaboratively  work  with  others  on  certain  tasks. 
Therefore,  one  agent  may  functionally  depend  on  other  agents.  A  service  migration  plan  must  take  this 
type  of  dependency  into  consideration.  An  algorithm  for  collaborative  migration  decision  making  has 
been  developed,  which  is  executed  by  a  coordinator  agent.  The  algorithm  takes  as  input  the  local 
migration decisions {S_1, ..., S_i, ..., S_m} from all the m agents in a group and a set of constraint rules R
for the agent group. The output is an agent migration plan. Three types of constraint rules are specified
in R: (1) atomicity rule - a set of agents must be migrated to the same platform; (2) dependency rule -
an agent functionally depends on at least one agent in a subset of agents and therefore must migrate to
the same platform as the agent on which it depends; and (3) exclusion rule - two or more agents must not
be migrated to the same platform. Basically, the collaborative migration decision algorithm
recursively takes one platform from each agent's list of feasible platforms to form a
tentative migration plan, and then checks this tentative plan against each constraint rule. If any rule is
violated, this tentative plan is not feasible and the algorithm moves to the next tentative migration plan
until  a  feasible  plan  is  identified  or  all  the  possible  tentative  plans  have  been  exhausted.  A 
collaborative  migration  plan  makes  sure  that  the  agent  migration  will  not  violate  any  of  the  constraints 
for  the  group  of  agents. 
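
The search performed by the coordinator agent can be sketched as follows. This is a simplified illustration rather than the algorithm as implemented: it enumerates tentative plans with itertools.product instead of explicit recursion, it reads the exclusion rule as requiring pairwise distinct platforms within a group, and the agent names, platform names, and rule layout are hypothetical.

```python
from itertools import product


def satisfies(plan, atomic, depends, exclusive):
    """Check one tentative plan (agent -> platform) against the rules in R."""
    # Atomicity: all agents of a group share one platform.
    for group in atomic:
        if len({plan[a] for a in group}) > 1:
            return False
    # Dependency: the agent shares a platform with at least one provider.
    for agent, providers in depends.items():
        if not any(plan[agent] == plan[p] for p in providers):
            return False
    # Exclusion: no two agents of a group share a platform.
    for group in exclusive:
        platforms = [plan[a] for a in group]
        if len(set(platforms)) < len(platforms):
            return False
    return True


def collaborative_plan(local_decisions, atomic=(), depends=None, exclusive=()):
    """Return the first feasible plan, or None if every candidate fails."""
    depends = depends or {}
    agents = list(local_decisions)
    # One platform from each agent's feasible list forms a tentative plan.
    for choice in product(*(local_decisions[a] for a in agents)):
        plan = dict(zip(agents, choice))
        if satisfies(plan, atomic, depends, exclusive):
            return plan
    return None


# Hypothetical local decisions: feasible platforms in order of preference.
local = {"a1": ["p2", "p3"], "a2": ["p3", "p2"], "a3": ["p2", "p3"]}
plan = collaborative_plan(local, atomic=[{"a1", "a2"}], exclusive=[{"a1", "a3"}])
```

The enumeration is exhaustive, so its cost grows with the product of the lengths of the feasible-platform lists; for small agent groups this is acceptable, and it guarantees that a feasible plan is found whenever one exists.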

To verify the proposed agent migration scheme, we have developed a proof-of-concept mobile
agent system based on Aglets, a Java-based agent platform and library for building mobile agent-based
applications.  An  aglet  is  a  Java  agent  which  is  specifically  designed  to  support  mobility,  i.e.,  allowing 
an  agent  to  migrate  across  the  hosts  of  one  or  more  networks.  Our  simulation  includes  local  and 
collaborative  agent  migration  decision  making  as  well  as  the  actual  agent  dispatching  to  their 
destination  platforms.  The  result  demonstrates  the  feasibility  and  efficiency  of  moving  agents  from 
one  platform  to  another  using  a  mobile  agent  platform. 

2.5 A Constrained, Possibilistic Logic Approach for System Survivability Evaluation

We  have  also  developed  a  logic  approach  to  facilitate  users  in  assessing  a  software  system  in  terms  of 
the  required  survivability  features.  Survivability  evaluation  is  essential  in  linking  foreign  software 
components  to  an  existing  system  or  obtaining  software  systems  from  external  sources.  It  is 
important  to  make  sure  that  any  foreign  software  components  will  not  compromise  the  current 
system’s survivability properties. Given the increasingly large scope and complexity of modern
software systems, there is a need for an evaluation framework to accommodate uncertain, vague, or
even ill-known knowledge for a robust evaluation based on multi-dimensional criteria. Our approach
incorporates user-defined constraints on survivability requirements. Necessity-based possibilistic
uncertainty  and  user  survivability  requirement  constraints  are  effectively  linked  to  logic  reasoning.  A 
proof-of-concept  system  has  been  developed  to  validate  the  proposed  approach. 

In  our  logical  approach  for  survivability  evaluation,  the  user’s  survivability  requirements  are 
represented in a logic with application-specific operators and inference rules. A system’s compliance
with those requirements is checked through a logic reasoning process. Applying a formal, logic-based
approach provides a rigorous verification and guarantee of system properties in a well-structured
reasoning process. The design of the logic evaluation framework follows these
principles and guidelines:

(1)  Since  survivability  is  a  multi-dimensional  concept,  a  software  system’s  properties  need  to 
be  evaluated  from  different  aspects,  including  security,  adaptability,  robustness,  and  fault  tolerance; 

(2)  Given  the  increasing  scope  and  complexity  of  modern  software  systems,  it  is  virtually 
impossible  for  a  user  to  evaluate  every  property  of  a  system.  For  an  objective  and  accurate 
assessment  of  a  system’s  survivability  features,  the  user  can  rely  on  trusted  third-party  evaluators  who 
specialize  in  particular  aspects  of  system  survivability.  Our  approach  supports 
collecting  survivability  property  certificates  from  trusted  evaluators,  encoded  as  logic  formulas, 
reasoning  on  those  individual  assessments  through  a  logic  proof  process,  and  integrating  them  into  a 
complete  survivability  evaluation  result; 

(3)  It  is  often  the  case  that  even  a  specialized  evaluator  cannot  be  very  certain  about  a  particular 
feature  of  a  software  system.  Our  approach  supports  logic  reasoning  on  uncertain,  imprecise,  or  even 
vague  information.  This  uncertainty-aware  reasoning  is  achieved  by  defining  many-valued  logic 
formulas  and  necessity-based  possibilistic  uncertainty,  where  uncertain  information  can  be  formally 
represented  and  linked  to  a  logic  reasoning  process.  The  proposed  approach  makes  it  possible  to 
express  fuzzy  pattern  matching  in  a  formal  survivability  proof; 

(4)  An  evaluation  framework  should  be  applicable  to  practical  case  scenarios.  In  terms  of  users’ 
system  property  requirements,  it  should  have  a  mechanism  to  represent  and  reason  about  constraints 
on  the  required  survivability  features  of  a  software  system.  Some  system  properties  may  take  others 
as  their  pre-requisite  conditions.  For  example,  the  system’s  self-healing  ability  depends  on  an 
accurate  and  timely  damage  assessment.  As  another  example,  the  capability  of  a  system  to 
reasonably  predict  the  causes  of  system  faults  and  take  the  corresponding  corrective  actions  to 
recover  from  damage  is  closely  related  to  the  system’s  ability  to  control  vulnerability.  Therefore, 
both  of  those  two  properties  may  be  required  for  the  system.  Incorporating  those  and  other 
constraints  and  allowing  an  efficient  connection  between  a  constraint  domain  and  a  logic  reasoning 
process  is  essential  for  a  survivability  evaluation  framework  to  be  practical.  Our  logic  framework 
supports  constrained  logic  reasoning  to  accommodate  these  and  other  types  of  constraints. 
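
The weakest-link reasoning behind principles (2) and (3) can be illustrated with a minimal Python sketch. It assumes the standard convention of necessity-valued possibilistic logic, in which a conclusion inherits the minimum necessity degree of its premises and of the rule used, and in which several certificates for the same property are merged by keeping the highest certified lower bound. The property names, degrees, and rules are hypothetical, and the code is not the project's JProlog prover.

```python
from typing import Dict, List, Tuple

# A certificate pairs a property name with a necessity degree in [0, 1].
Certificate = Tuple[str, float]
# A rule is (premises, conclusion, necessity degree of the rule itself).
Rule = Tuple[List[str], str, float]


def merge_certificates(certs: List[Certificate]) -> Dict[str, float]:
    """Keep, per property, the highest necessity degree any evaluator certified."""
    beliefs: Dict[str, float] = {}
    for prop, degree in certs:
        beliefs[prop] = max(beliefs.get(prop, 0.0), degree)
    return beliefs


def infer(beliefs: Dict[str, float], rules: List[Rule]) -> Dict[str, float]:
    """Forward-chain until no necessity degree can be raised any further."""
    derived = dict(beliefs)
    changed = True
    while changed:
        changed = False
        for premises, conclusion, rule_degree in rules:
            if not all(p in derived for p in premises):
                continue
            # A conclusion is only as certain as the least certain
            # premise or the rule itself (weakest-link principle).
            degree = min([rule_degree] + [derived[p] for p in premises])
            if degree > derived.get(conclusion, 0.0):
                derived[conclusion] = degree
                changed = True
    return derived


# Hypothetical certificates from two evaluators and two illustrative rules.
certificates = [
    ("damage_assessment", 0.6),
    ("damage_assessment", 0.8),
    ("fault_recovery", 0.9),
    ("vulnerability_control", 0.7),
]
rules = [
    (["damage_assessment", "fault_recovery"], "self_healing", 0.9),
    (["self_healing", "vulnerability_control"], "survivable", 1.0),
]

result = infer(merge_certificates(certificates), rules)
print(result["self_healing"], result["survivable"])
```

Because every derived degree is a minimum over values already present, the loop can only raise a degree finitely many times and therefore terminates.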

The  designed  logic  supports  fuzzy  pattern  matching  for  reasoning  about  uncertainty  in  survivability 
evaluation,  as  well  as  the  specification  and  verification  of  user  requirement  constraints.  We  present  a  logic 
mechanism  to  incorporate  survivability  requirement  constraints  and  possibilistic  uncertainty  into 
software  system  survivability  evaluation.  A  formal  design  is  presented  to  link  the  hybrid  worlds  of 
constraint  domains  to  logic  reasoning.  The  interplay  between  constraint  checking  and  logic 
reasoning  is  supported  by  a  set  of  logic  inference  rules.  To  make  sure  that  the  logic  inference  rules 
are  correct,  we  have  developed  a  prototype  theorem  prover  implemented  in  JProlog,  with  the  logic 
engine  encoded  in  Prolog.  We  have  conducted  a  set  of  experiments  for  system  survivability 
evaluation.  All  the  logic  inference  rules  have  been  validated. 
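
As a rough illustration of how requirement constraints might be checked against the outcome of uncertain reasoning, the following Python sketch compares necessity degrees with user-defined thresholds and enforces prerequisite constraints between required properties. The function name, the threshold representation, and all values are assumptions made for illustration; the project's actual inference rules are not reproduced here.

```python
from typing import Dict, List, Tuple

Violation = Tuple[str, str]  # (property, reason)


def check_requirements(
    degrees: Dict[str, float],
    thresholds: Dict[str, float],
    prerequisites: Dict[str, List[str]],
) -> List[Violation]:
    """Return every requirement constraint that the evaluated system violates."""
    violations: List[Violation] = []
    # Threshold constraints: each required property must be established
    # with at least the necessity degree the user demands.
    for prop, minimum in thresholds.items():
        if degrees.get(prop, 0.0) < minimum:
            violations.append((prop, "necessity below required threshold"))
    # Prerequisite constraints: a required property drags its
    # prerequisites into the requirement set as well.
    for prop, needed in prerequisites.items():
        if prop not in thresholds:
            continue
        for dep in needed:
            if dep not in thresholds:
                violations.append((prop, f"prerequisite '{dep}' is not required"))
    return violations


# Hypothetical evaluation result and constraints.
degrees = {"self_healing": 0.8, "damage_assessment": 0.5}
prerequisites = {"self_healing": ["damage_assessment"]}

complete = check_requirements(
    degrees, {"self_healing": 0.7, "damage_assessment": 0.6}, prerequisites
)
incomplete = check_requirements(degrees, {"self_healing": 0.7}, prerequisites)
print(complete)
print(incomplete)
```

In this sketch the first call reports that damage assessment falls short of its threshold, while the second reports that self-healing was required without its prerequisite.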